A Parallel Decoder Algorithm for Low Density Parity Check Convolutional Codes for the XInC Multi-threaded Microprocessor
Authors
Abstract
Low-density parity-check convolutional codes (LDPC-CCs) are a relatively new class of powerful error control codes. Their advantage over other capacity-approaching codes (e.g. Turbo codes and block-oriented LDPC codes) is that they can encode data blocks of arbitrary length. The goal of our project is to investigate high-performance software implementations of the encoding and decoding algorithms for LDPC-CCs. In particular, we are interested in efficient implementations for the XInC multithreaded microprocessor developed by Eleven Engineering Inc. (Edmonton).

In communication systems, information must be transmitted reliably from end to end. However, channel distortion and unavoidable noise can degrade the received signal to the point where bit errors occur in the received information. To detect and correct these errors, redundant information (i.e. check bits) can be added to the intended data (i.e. information bits) at the transmitter before the signal is sent into the channel. In his landmark 1948 paper, Claude Shannon showed that for any given channel bandwidth and signal-to-noise power ratio (SNR), there exists a maximum bit rate at which information can be sent over the channel and decoded at the receiver with an arbitrarily small probability of error. Finding code constructions that reach this Shannon limit has been a major challenge.

Block-based low-density parity-check codes (LDPC-BCs) were first proposed by Gallager in the early 1960s. Along with the more recent Turbo codes, LDPC-BCs are among the few error control codes whose performance approaches the Shannon limit. Unfortunately, LDPC-BCs were not pursued after their initial invention because of the computational complexity of the required decoding algorithm. Recent improvements in hardware speed have made LDPC-BCs practical, and these codes have been adopted by several recent communication standards, including DVB-S2 and IEEE 802.3 10GBASE-T.
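To make the Shannon limit mentioned above concrete, the following minimal sketch evaluates the well-known capacity formula C = B log2(1 + S/N). It assumes an additive white Gaussian noise channel, which the abstract does not state explicitly, and the function name `shannon_capacity_bps` is chosen here purely for illustration.

```python
import math


def shannon_capacity_bps(bandwidth_hz: float, snr_db: float) -> float:
    """Capacity in bit/s of an AWGN channel (assumed channel model).

    bandwidth_hz: channel bandwidth B in hertz.
    snr_db: signal-to-noise power ratio in decibels.
    """
    snr_linear = 10.0 ** (snr_db / 10.0)  # convert dB to a plain power ratio
    return bandwidth_hz * math.log2(1.0 + snr_linear)


if __name__ == "__main__":
    # Illustrative values only: a 1 MHz channel at a few SNR levels.
    for snr_db in (0.0, 3.0, 10.0):
        capacity = shannon_capacity_bps(1e6, snr_db)
        print(f"SNR {snr_db:4.1f} dB -> capacity {capacity / 1e6:.3f} Mbit/s")
```

No code can carry information reliably above this rate. A capacity-approaching code such as an LDPC code aims to operate close to it at a practical decoding cost.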
LDPC-BCs have the disadvantage of requiring data to be partitioned into blocks of a fixed size. In the late 1990s, a convolutional counterpart of the LDPC-BCs, the LDPC convolutional codes (LDPC-CCs), was proposed by Felstrom and Zigangirov. Compared with LDPC-BCs, LDPC-CCs have the following important advantages:

I. As with any convolutional code, and unlike block codes, LDPC-CCs can encode data blocks of arbitrary size. This is a key advantage when transmitting data packets of variable size (e.g. Internet file transfers) or streaming media (e.g. video).

II. LDPC-CCs also have advantages in both the encoding and decoding steps. An encoder for LDPC-BCs must generally store an entire block of arriving data before the parity bits can be calculated. In LDPC-CCs, the storage requirements for the encoder are much smaller, involving only recent data bits, recent code bits, and the encoder state. Similarly, an LDPC-BC decoder must store an entire codeword and process it iteratively as a whole, whereas in an LDPC-CC decoder a pipeline of identical decoding steps processes the encoded data as it arrives in a continuous stream.

Recent work at the University of Alberta has led to the first hardware implementations of LDPC-CC encoders and decoders using both Field-Programmable Gate Arrays (FPGAs) and semicustom Application-Specific Integrated Circuits (ASICs). The focus of our new work is to exploit the parallel computing resources of commercially available microprocessors to implement efficient software-based decoders.

The XInC chip from Eleven Engineering provides a multithreading environment with eight identical threads implemented directly in hardware. Computation occurs on a common RISC processing core, which the eight threads access in round-robin fashion. Each thread has its own set of registers and can be used as if it were running on a separate processor.
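The point made above about encoder storage, namely that a convolutional encoder needs only recent data bits and recent code bits, can be illustrated with a toy sketch. This is not the LDPC-CC construction of Felstrom and Zigangirov, whose codes are time-varying and periodic. It is a simplified time-invariant, systematic, rate-1/2 recursive encoder, and the tap positions `INFO_TAPS` and `PARITY_TAPS` are hypothetical values chosen only for illustration.

```python
from collections import deque

# Hypothetical tap positions, chosen only for illustration (not from the source).
INFO_TAPS = (0, 2, 5)   # delays applied to information bits (0 = current bit)
PARITY_TAPS = (1, 4)    # delays applied to previously emitted parity bits
MEMORY = 5              # largest delay used by either tap set


def encode_stream(info_bits):
    """Yield (info_bit, parity_bit) pairs for an input stream of any length.

    Only the last MEMORY + 1 information bits and the last MEMORY parity
    bits are kept, regardless of how long the stream is.
    """
    recent_info = deque([0] * (MEMORY + 1), maxlen=MEMORY + 1)
    recent_parity = deque([0] * MEMORY, maxlen=MEMORY)
    for bit in info_bits:
        recent_info.appendleft(bit)         # index d holds the bit from d steps ago
        parity = 0
        for d in INFO_TAPS:
            parity ^= recent_info[d]
        for d in PARITY_TAPS:
            parity ^= recent_parity[d - 1]  # index d - 1 holds parity from d steps ago
        recent_parity.appendleft(parity)    # oldest parity bit drops off the far end
        yield bit, parity


if __name__ == "__main__":
    message = [1, 0, 1, 1, 0, 0, 1, 0, 1]   # any length works; no block size is fixed
    for info_bit, parity_bit in encode_stream(message):
        print(info_bit, parity_bit)
```

Because the encoder is a generator over a stream, the same code handles a short packet or an unbounded media stream with constant memory, which is the property that a block encoder lacks.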
Threads can share information by means of shared memory. This platform appeared to be a reasonably good match for the parallel structure of the LDPC-CC decoding algorithm.

An emulator for the XInC chip was developed to support algorithm development and performance evaluation. The emulator supports 8 independent execution threads, all 18 XInC instructions, all of the registers, and some of the peripherals. It can be used to investigate the potential benefits of additional hardware units to facilitate error control coding (e.g. new instructions, new functional units, or a memory-mapped coprocessor) as well as the performance of alternative parallel implementations. Finally, evaluation boards from XInC can be used to measure the performance of the parallel algorithm and of alternative LDPC codes in real wireless environments.

Several dB of coding gain over the existing scheme is expected from this work. The practical benefits would include greater reliable transmission range, greater reliable bit rates at the same range, and lower required power to achieve an acceptable bit error rate. For example, the following figure illustrates how the range could be increased by changing the SNR (rho value).

Figure 1: Range versus Coding Gain

References:

Claude Shannon, 1948, “A mathematical theory of communication,” Bell System Technical Journal, vol. 27, pp. 379-423 and 623-656.
A. Jimenez-Felstrom and K. Sh. Zigangirov, 1999, “Time-varying periodic convolutional codes with low-density parity-check matrix,” IEEE Trans. Information Theory, vol. 45, no. 6, pp. 2181-2191.
Robert G. Gallager, 1963, Low-Density Parity-Check Codes.
Eleven Engineering Inc., http://www.elevenengineering.com/
Similar resources
VLSI Design of a Fully-Parallel High-Throughput Decoder for Turbo Gallager Codes
The most powerful channel coding schemes, namely those based on turbo codes and low-density parity-check (LDPC) Gallager codes, have in common the principle of iterative decoding. However, the relative coding structures and decoding algorithms are substantially different. This paper presents a 2048-bit, rate-1/2 soft decision decoder for a new class of codes known as Turbo Gallager Codes. These...
A "multi-user" Approach towards a Channel Decoder for Convolutional, Turbo and LDPC Codes
In this paper we present the concept of a high-throughput multi-mode channel decoder architecture that consists of a tightly coupled array of independently programmable processing cores. Every core is capable of decoding low-density parity-check (LDPC), convolutional turbo (CTC) and convolutional codes (CC) either independently or jointly with other cores. This approach allows parallel handling...
Multitree search decoding of linear codes
Tree search algorithms have a long history in computer science. In the coding literature, tree search algorithms have traditionally been used for decoding convolutional codes. Convolutional codes are linear codes with a special structure. Classic tree search decoders (most notably the sequential decoder) search one code tree in which the bits are ordered sequentially. We propose a multitree sea...
Analyzing the turbo decoder using the Gaussian approximation
In this paper, we introduce a simple technique for analyzing the iterative decoder that is broadly applicable to different classes of codes defined over graphs in certain fading as well as additive white Gaussian noise (AWGN) channels. The technique is based on the observation that the extrinsic information from constituent maximum a posteriori (MAP) decoders is well approximated by Gaussian ra...
Search Based Weighted Multi-Bit Flipping Algorithm for High-Performance Low-Complexity Decoding of LDPC Codes
In this paper, two new hybrid algorithms are proposed for decoding Low Density Parity Check (LDPC) codes. Original version of the proposed algorithms named Search Based Weighted Multi Bit Flipping (SWMBF). The main idea of these algorithms is flipping variable multi bits in each iteration, change in which leads to the syndrome vector with least hamming weight. To achieve this, the proposed algo...